REMI: REGRESSION WITH MARGINAL INFORMATION AND ITS APPLICATION IN GENOME-WIDE ASSOCIATION STUDIES


Abstract

In this study, we consider the problem of variable selection and estimation in high-dimensional linear regression models when the complete data are not accessible and only certain marginal information or summary statistics are available. This problem is motivated by genome-wide association studies (GWAS), which have been widely used to identify risk variants underlying complex human traits/diseases. With a large number of completed GWAS, statistical methods using summary statistics become more and more important because of the restricted accessibility of individual-level data sets. Theoretically guaranteed methods are in high demand to advance statistical inference with the large amount of available marginal information. Here we propose an $\ell_1$ penalized approach, REMI, to estimate high-dimensional regression coefficients with marginal information and external reference samples. We establish an upper bound on the error of the REMI estimator, which has the same order as the minimax error bound of the Lasso with complete individual-level data. In particular, when marginal information is obtained from a large number of samples together with a small number of reference samples, REMI yields good estimation and prediction results, and outperforms the Lasso because the sample size of accessible individual-level data can be limited. Through simulation studies and real data analysis of the NFBC1966 GWAS data set, we demonstrate that REMI can be widely applicable. The developed R package and the codes to reproduce all the results are available at https URL
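To make the idea of $\ell_1$ penalized regression from marginal information concrete, the following is a minimal Python sketch, not the authors' REMI implementation or their R package. It assumes standardized predictors, a vector `r` of marginal statistics (for example $X^\top y / n$ from the large study), and a covariance estimate `sigma` computed from a small reference sample, and it minimizes $\tfrac{1}{2}\beta^\top \Sigma \beta - r^\top \beta + \lambda \lVert \beta \rVert_1$ by coordinate descent. The function names, the objective scaling, and the stopping rule are all illustrative assumptions.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator used by l1-penalized coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def lasso_from_summary(sigma, r, lam, n_iter=200, tol=1e-8):
    """Hypothetical helper: minimize 0.5*b'Sigma*b - r'b + lam*||b||_1.

    sigma : (p, p) covariance estimate, e.g. from reference samples
            (assumed to have a strictly positive diagonal)
    r     : (p,) marginal statistics, e.g. X'y / n from the large study
    lam   : l1 penalty level
    """
    p = r.shape[0]
    beta = np.zeros(p)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Correlation of predictor j with the partial residual,
            # expressed only through summary-level quantities.
            rho = r[j] - sigma[j] @ beta + sigma[j, j] * beta[j]
            new_bj = soft_threshold(rho, lam) / sigma[j, j]
            max_change = max(max_change, abs(new_bj - beta[j]))
            beta[j] = new_bj
        if max_change < tol:
            break
    return beta


def simulate_example(seed=0):
    """Simulated illustration: large study gives r, small panel gives sigma."""
    rng = np.random.default_rng(seed)
    n, n_ref, p = 2000, 200, 20
    beta_true = np.zeros(p)
    beta_true[:3] = [1.0, -0.8, 0.5]
    X = rng.standard_normal((n, p))
    y = X @ beta_true + rng.standard_normal(n)
    X_ref = rng.standard_normal((n_ref, p))   # external reference samples
    r = X.T @ y / n                           # marginal information only
    sigma = X_ref.T @ X_ref / n_ref           # reference covariance estimate
    return lasso_from_summary(sigma, r, lam=0.1), beta_true
```

Calling `simulate_example()` returns the fitted and true coefficient vectors for comparison. When the reference sample is small relative to the number of predictors, `sigma` can be rank-deficient, and the $\ell_1$ penalty is what keeps the problem well behaved in this sketch.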


Similar articles

Genome Wide Association Studies, Next Generation Sequencing and Their Application in Animal Breeding and Genetics: A Review

Recently, genetic studies have been revolutionized by next generation sequencing (NGS) technology, and the use of this technology is expected to largely eliminate defects in the methods of association studies. NGS technology is becoming the premier tool in genetics. However, at the moment the use of this method is limited, especially in livestock, due to high cost and computation...


Annotation Regression for Genome-Wide Association Studies with an Application to Psychiatric Genomic Consortium Data.

Although genome-wide association studies (GWAS) have been successful at finding thousands of disease-associated genetic variants (GVs), identifying causal variants and elucidating the mechanisms by which genotypes influence phenotypes are critical open questions. A key challenge is that a large percentage of disease-associated GVs are potential regulatory variants located in noncoding regions, ...


Regularized regression method for genome-wide association studies

We use a novel penalized approach for genome-wide association study that accounts for the linkage disequilibrium between adjacent markers. This method uses a penalty on the difference of the genetic effect at adjacent single-nucleotide polymorphisms and combines it with the minimax concave penalty, which has been shown to be superior to the least absolute shrinkage and selection operator (LASSO...


Genome-wide Association Studies

Progress in probabilistic generative models has accelerated, developing richer models with neural architectures, implicit densities, and with scalable algorithms for their Bayesian inference. However, there has been limited progress in models that capture causal relationships, for example, how individual genetic factors cause major human diseases. In this work, we focus on two challenges in par...



Journal

Journal title: Statistica Sinica

Year: 2021

ISSN: 1017-0405, 1996-8507

DOI: https://doi.org/10.5705/ss.202019.0182